Abstract

The constitutional assignment of natural products by NMR spectroscopy is usually based on 2D NMR experiments 
like COSY, HSQC, and HMBC. The actual difficulty of the structure elucidation problem depends more on the type 
of the investigated molecule than on its size. As soon as HMBC data are involved in the process or a large
number of heteroatoms is present, multiple solutions may fit the same data set. Structure
elucidation software can be used to find such alternative constitutional assignments and to support the discussion
needed to find the correct solution, but this is rarely done. This article describes the use of theoretical NMR
correlation data in the structure elucidation process with WEBCOCON, not for the initial constitutional assignments, 
but to define how well a suggested molecule could have been described by NMR correlation data. The results of 
this analysis can be used to decide on further steps needed to assure the correctness of the structural assignment. 
As a first step, the deviation of carbon chemical shifts is analyzed by comparing the chemical shifts
predicted for each possible solution with the experimental data. The application of this technique to three
well-known compounds is shown. Using NMR correlation data alone to describe the constitution is not
always sufficient, even when 13C chemical shift prediction is included.



Findings 

Nuclear Magnetic Resonance spectroscopy and either
elemental analysis or high-resolution mass spectrometry
are the most common tools used for the structure
elucidation of new compounds. 2D NMR experiments such
as COSY, HSQC, and 13C-HMBC deliver correlation
information between atoms that can be translated into
connectivity information. Of these, the correlations from
COSY and HSQC experiments can be transcribed directly
into connectivity between atoms, whereas the 13C-HMBC
correlations need more attention because of their
ambiguity and complexity. Hence the difficulty of the
structure elucidation problem depends more on the type
of the investigated molecule than on its size [1].
Saturated compounds can usually be assigned
unambiguously using mainly COSY and some 13C-HMBC
data, whereas condensed heterocycles are problematic
because they lack protons that could show interatomic
connectivities. This ambiguity has driven the
development of different software packages to aid in the
interpretation of the 13C-HMBC correlation data [2-20] as
much as the development of additional correlation
experiments [21,22].

Most of these approaches have in common that they
work only with experimental NMR correlation data.
COCON [1,4,23,24] has recently been extended with the
capability to create a theoretical NMR correlation data
set based on a molecule's suggested constitution. The
theoretical data set is then used as input for the
structure elucidation software COCON. The resulting set
of constitutional assignments indicates how
unambiguously NMR could have described the originally
suggested molecule. The freely accessible online version
of COCON (WEBCOCON at http://cocon.nmr.de)
offers this analysis as "Alternative Constitutions".

The data derived from the NMR correlation spectra
result from magnetization transfer via scalar coupling
between the atoms in the molecule of interest. Since
scalar coupling is mediated by the interatomic bonds, the
correlation data reflect those bonds. Hence, a set of
all feasible NMR correlation data (theoretical correlation
data) can be derived from the molecular constitution.
This is done by visiting every proton in the molecule in
turn and building a list of the atoms at two-bond and
three-bond distance: from each proton, all connectivities
are inspected recursively up to a distance of three
bonds. If a carbon is found at a two-bond distance, a 2J
and a 1,1-ADEQUATE correlation are added to the list.
If a carbon is found at a three-bond distance, an HMBC
correlation is added to the list; if a proton is found, a
COSY correlation is added. In principle, 4J correlations
for COSY and HMBC could be generated as well, since
they are sometimes observable in experiments. However,
COCON cannot handle 4J COSY correlations, so those
are left out. 4J HMBC correlations are not generated
either, because allowing HMBC correlations to be 4J in
the structure generation process makes it take much
longer and produce many more results. Finally, carbon
chemical shifts are generated by lookup in a table that
was reverse-generated from the chemical shift rules that
COCON uses. These values are not comparable to a
chemical shift prediction, but they suffice to ensure that
COCON will generate the starting structure.
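
The proton-centred search described above can be illustrated with a small sketch. It is not COCON's implementation: the function name, the input format (a dictionary of element symbols plus a list of bonded index pairs) and the correlation labels are assumptions made for this example, and the bond distance is taken as the shortest path between two atoms.

```python
from collections import defaultdict


def theoretical_correlations(atoms, bonds):
    """Enumerate 2-bond and 3-bond correlations starting from every proton.

    atoms: dict mapping atom index -> element symbol (hypothetical input format)
    bonds: iterable of (i, j) index pairs, one per bond
    Returns a sorted list of (experiment, atom_a, atom_b) tuples.
    """
    neighbours = defaultdict(set)
    for i, j in bonds:
        neighbours[i].add(j)
        neighbours[j].add(i)

    def at_distance(start, n_bonds):
        # Breadth-first expansion: atoms whose shortest path from
        # `start` is exactly n_bonds long (a simplification for rings).
        seen = {start}
        frontier = {start}
        for _ in range(n_bonds):
            frontier = {nb for a in frontier for nb in neighbours[a]} - seen
            seen |= frontier
        return frontier

    correlations = set()
    for h, element in atoms.items():
        if element != "H":
            continue
        for c in at_distance(h, 2):
            if atoms[c] == "C":
                correlations.add(("HMBC-2J", h, c))
                correlations.add(("1,1-ADEQUATE", h, c))
        for x in at_distance(h, 3):
            if atoms[x] == "C":
                correlations.add(("HMBC-3J", h, x))
            elif atoms[x] == "H":
                # Store each proton pair once, lowest index first.
                correlations.add(("COSY", min(h, x), max(h, x)))
    return sorted(correlations)


# Toy fragment C0-C1-C2 with one proton on C0 (atom 3) and one on C1 (atom 4).
fragment_atoms = {0: "C", 1: "C", 2: "C", 3: "H", 4: "H"}
fragment_bonds = [(0, 1), (1, 2), (0, 3), (1, 4)]
for correlation in theoretical_correlations(fragment_atoms, fragment_bonds):
    print(correlation)
```

No four-bond correlations are produced, in line with the restriction described in the text.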

For online use, the MarvinSketch applet from
ChemAxon is available for drawing or loading the
molecule. The resulting MDL file contains all atoms, their
connectivity and multiplicity information. Based on this
file, the recently developed module "Alternative
Constitutions" in WEBCOCON generates atom types,
theoretical correlation data and table-based carbon
chemical shifts.

The actual magnitude of the scalar coupling, and
therefore the observability of a correlation, depends on
the atoms involved, their chemical environment and
their relative geometry. For 1J and 2J couplings, mainly
the atoms involved and their chemical environment are
of importance, since the geometry varies little. This is
different for 3J coupling, which depends on the dihedral
angle, so the actual molecular conformation determines
the magnitude of the coupling. The creation of
theoretical correlation data disregards the molecule's
real conformation and assumes that all correlations are
observable. Hence the data set represents the upper
limit of correlations that may be experimentally available
for the constitution.
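
The dihedral-angle dependence of the three-bond coupling is commonly approximated by a Karplus-type relation. The source does not name a specific parametrization, so the form below is only an illustration; the coefficients A, B and C are empirical and depend on the atoms and substituents involved.

```latex
{}^{3}J(\phi) = A\cos^{2}\phi + B\cos\phi + C
```

Because such a relation gives small couplings for dihedral angles near 90 degrees, a correlation that is counted in the theoretical data set may be too weak to observe in an experiment.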

Calculations were run for three molecules (Figure 1)
on the publicly available WEBCOCON server; running
times varied from one to twelve minutes. All molecules
were drawn in the "Alternative Constitutions" module
and submitted to the server. The numbers of solutions
suggested for Ascomycin 1 and Oroidin 2 in runs with
theoretical and experimental data are shown in Table 1.
In addition, a web page giving direct access to the
results shown here has been set up on the WEBCOCON
server at http://cocon.nmr.de/StructureDiscussion/ (the
results are mirrored at
http://science.jotjot.net/StructureDiscussion/).





Figure 1 Ascomycin 1, Oroidin 2 and Aflatoxin B1 3 are used 
to evaluate the use of theoretical data 



Ascomycin 1 is a well-known ethyl derivative of
Tacrolimus; it serves as an example of a large natural
product, featuring 43 carbon atoms. Using theoretical
NMR correlation data (COSY and 13C-HMBC
correlations), COCON generates only one solution,
regardless of whether atom types are defined or not.
Using experimental COSY and 13C-HMBC correlation
data, the structure generator comes up with 100
structural assignments, which are reduced to one when
the atom types are fixed as well. In this case NMR
correlation data were able to define the constitution
unambiguously.

Oroidin 2 has frequently been used to demonstrate
COCON. The use of theoretical COSY and 13C-HMBC
correlations leads to a total of 16 possible constitutional
assignments; additionally predefining the atom types
reduces this set to one constitutional assignment. The
experimental data set leads to 252,566 structural
assignments, which are reduced to 1,486 when the atom
types are predefined as well. Hence the structure cannot
be safely determined by NMR alone. The original
structure determination was carried out by chemical
derivatization and total synthesis [25,26].

Table 1 Number of constitutional assignments suggested
for 1 and 2.

The picture changes with Aflatoxin B1 3, which has 17
carbon atoms. Using theoretical COSY and 13C-HMBC
data alone, COCON generates 1,048 structures,
compared to 1,932 solutions using experimental data.
When the atom types are predefined, COCON generates
55 constitutional assignments, compared to 108 with
experimental data. The generated set of molecules
contains constitutions with a cyclobutadiene unit, a
structural element that is very uncommon in natural
products. COCON has several built-in rules that
eliminate certain constitutional elements, such as
cyclobutadiene, cyclopropene and peroxides. By default
these rules are not used, but in this special case we
observed a substantial difference in the number of
results.
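
Rules of this kind can be pictured as simple substructure checks applied to each generated constitution. The sketch below is an assumption-laden illustration and not COCON's rule set: the function names and the input format (element symbols plus bonds given as index pairs with a bond order) are hypothetical, and only the peroxide and cyclopropene cases are covered.

```python
from collections import defaultdict


def has_peroxide(atoms, bonds):
    """True if any bond joins two oxygen atoms."""
    return any(atoms[i] == "O" and atoms[j] == "O" for i, j, _order in bonds)


def has_cyclopropene(atoms, bonds):
    """True if a C=C double bond is part of a three-membered carbon ring."""
    neighbours = defaultdict(set)
    for i, j, _order in bonds:
        neighbours[i].add(j)
        neighbours[j].add(i)
    for i, j, order in bonds:
        if order == 2 and atoms[i] == "C" and atoms[j] == "C":
            # A carbon bonded to both ends of the double bond closes
            # a three-membered ring around it.
            shared = neighbours[i] & neighbours[j]
            if any(atoms[k] == "C" for k in shared):
                return True
    return False


def apply_rules(candidates, rules=(has_peroxide, has_cyclopropene)):
    """Keep only the (atoms, bonds) candidates for which no rule fires."""
    return [
        (atoms, bonds)
        for atoms, bonds in candidates
        if not any(rule(atoms, bonds) for rule in rules)
    ]


# Three toy candidates: cyclopropene, a peroxide, and propene (heavy atoms only).
cyclopropene = ({0: "C", 1: "C", 2: "C"}, [(0, 1, 2), (1, 2, 1), (0, 2, 1)])
peroxide = ({0: "C", 1: "O", 2: "O", 3: "C"}, [(0, 1, 1), (1, 2, 1), (2, 3, 1)])
propene = ({0: "C", 1: "C", 2: "C"}, [(0, 1, 2), (1, 2, 1)])
kept = apply_rules([cyclopropene, peroxide, propene])
print(len(kept))
```

Keeping the rules optional, as the text describes, lets the unfiltered and filtered result counts be compared for the same data set.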

When these rules are activated, the number of
solutions drops to 58 for the experimental correlation
data set and 33 for the theoretical data set. All planar
molecules suggested are shown in Figure 2; the correct
constitution and starting point of the analysis is 6. For
the small number of interesting constitutions, the
carbon chemical shifts were back-calculated (ChemDraw
v11) and compared to the experimental values (see
Table 2). The last line in the table contains the sum of
the absolute chemical shift differences for all carbons,
exposing molecule 6 as the one that best fits the
experimental data [24,27,28].
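
The ranking criterion used here, the sum of absolute differences between predicted and experimental carbon shifts, can be written down in a few lines. The following is a minimal sketch: the function names are hypothetical, it assumes the shifts of each candidate are already listed in the same carbon order as the experimental values, and the numbers in the example are made-up placeholders, not the values of Table 2.

```python
def total_shift_deviation(experimental, predicted):
    """Sum of |delta_exp - delta_pred| over all carbons, in ppm."""
    if len(experimental) != len(predicted):
        raise ValueError("shift lists must cover the same carbons")
    return sum(abs(e - p) for e, p in zip(experimental, predicted))


def rank_candidates(experimental, candidates):
    """Return (name, deviation) pairs, best-fitting candidate first.

    candidates: dict mapping a candidate label to its predicted shifts.
    """
    scores = {
        name: total_shift_deviation(experimental, shifts)
        for name, shifts in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Placeholder shifts in ppm for a three-carbon toy case (not real data).
experimental_shifts = [100.0, 50.0, 20.0]
predicted_shifts = {
    "A": [101.0, 49.0, 20.5],
    "B": [110.0, 50.0, 25.0],
}
for name, deviation in rank_candidates(experimental_shifts, predicted_shifts):
    print(name, deviation)
```

A plain sum weights every carbon equally; a single badly predicted carbon can therefore dominate the score, which is worth keeping in mind when candidates lie close together.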

The theoretical NMR correlation data set is the
upper limit of the number of correlations that are
possible for a given constitution. Therefore all
alternative constitutions generated with this data are
"NMR-identical" with regard to correlation data. A
careful analysis of these alternatives might be used to
direct the further investigations needed to confirm the
proposed constitution. Whilst Ascomycin's structure can
be confirmed by NMR correlations, Oroidin's structure
cannot. The results obtained would direct further work
towards chemical derivatization and synthesis [25,26] or
X-ray crystallography. The results obtained for
Aflatoxin B1 show nicely how carbon chemical shift
prediction can be used as a tool for the structure
discussion, exposing one suggested constitutional
assignment as the best fit.

Table 2 Experimental and predicted 13C chemical shifts
for the different constitutions suggested for Aflatoxin B1.

Availability 

The WEBCOCON server is freely accessible via
http://cocon.nmr.de.